
perf(iterator): pre-seed waste pool with embedded Item slots #2294

Open
shaunpatterson wants to merge 1 commit into dgraph-io:main from shaunpatterson:perf/iterator-prefilled-items

Summary

Iterator.prefetch() with PrefetchValues=false uses prefetchSize=2, so the first two newItem() calls on a fresh iterator always heap-allocate before the per-iterator waste recycling kicks in. For workloads like dgraph posting-list rollup (NewKeyIterator, AllVersions=true, PrefetchValues=false) that re-create iterators per posting list, this is 2 allocs/op of pure churn.

Embed [2]Item directly in the Iterator struct and push them onto the waste pool at construction. The first two newItem() pops now return these embedded slots; only iterators that demand more items (PrefetchValues=true with PrefetchSize>2) fall back to allocating.
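The pop-then-fall-back behaviour described above can be sketched roughly as follows. This is a simplified model, not the actual badger code: `item`, `list`, `iterator`, `prefilled`, and the `heapAllocs` counter are hypothetical stand-ins chosen for illustration.

```go
package main

// item stands in for badger's Item; only the free-list link matters here.
type item struct {
	key  []byte
	next *item
}

// list is a minimal singly linked free list, modeled on the waste pool.
type list struct {
	head *item
}

func (l *list) push(i *item) {
	i.next = l.head
	l.head = i
}

func (l *list) pop() *item {
	i := l.head
	if i == nil {
		return nil
	}
	l.head = i.next
	i.next = nil
	return i
}

// iterator embeds two item slots so that the first two newItem calls
// can be served without a separate heap allocation.
type iterator struct {
	waste      list
	prefilled  [2]item
	heapAllocs int // illustration only: counts fallback allocations
}

// newIterator pushes the embedded slots onto the waste pool at construction.
func newIterator() *iterator {
	it := &iterator{}
	for i := range it.prefilled {
		it.waste.push(&it.prefilled[i])
	}
	return it
}

// newItem reuses a pooled item when one is available and allocates otherwise.
func (it *iterator) newItem() *item {
	if i := it.waste.pop(); i != nil {
		return i
	}
	it.heapAllocs++
	return &item{}
}
```

In this model, a third `newItem()` call without any recycling in between is the first one to hit the allocating branch, which corresponds to the PrefetchSize>2 case above.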

Safety

Iterator is only ever used through *Iterator and is never copied by value, so embedding Item, which contains a sync.WaitGroup, does not introduce a lock copy. txn is assigned to each prefilled item to mirror what newItem does for heap-allocated items.
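To make the aliasing concern concrete, here is a small sketch under the same caveat: the types and the `ownsSlot` helper are hypothetical and only model the shape of the change. The pool entries are pointers into the iterator's own memory, so a by-value copy of the iterator would leave the copy's pool pointing at the original's slots, in addition to copying a `sync.WaitGroup`, which must not be copied after first use.

```go
package main

import "sync"

// txn stands in for the transaction the iterator belongs to.
type txn struct{ readTs uint64 }

// item carries a WaitGroup, so it must stay at a stable address once in use.
type item struct {
	wg   sync.WaitGroup
	txn  *txn
	next *item
}

// iterator is always handled through a pointer in this sketch.
type iterator struct {
	txn       *txn
	waste     *item // head of the free list
	prefilled [2]item
}

// newIterator seeds the pool with the embedded slots and assigns txn to
// each one, mirroring what the allocating path does for a fresh item.
func newIterator(t *txn) *iterator {
	it := &iterator{txn: t}
	for i := range it.prefilled {
		slot := &it.prefilled[i]
		slot.txn = t
		slot.next = it.waste
		it.waste = slot
	}
	return it
}

// ownsSlot reports whether p points at one of this iterator's embedded slots.
func (it *iterator) ownsSlot(p *item) bool {
	for i := range it.prefilled {
		if p == &it.prefilled[i] {
			return true
		}
	}
	return false
}
```

Walking `waste` right after construction visits two entries, each owned by the iterator and each carrying the iterator's txn.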

Benchmark — BenchmarkRollupKeyIterator (Apple M4 Max)

|        | ns/op | B/op | allocs/op |
|--------|-------|------|-----------|
| before | 829.5 | 770  | 13        |
| after  | ~800  | 754  | 11        |
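For anyone who wants to see the effect in isolation, the following sketch compares a pool that starts empty with one that is pre-seeded from embedded slots, using `testing.AllocsPerRun` from the standard library. It is not BenchmarkRollupKeyIterator and does not reproduce the numbers above; `plainIter`, `seededIter`, `measure`, and `sink` are hypothetical names, and the exact counts depend on the compiler's escape analysis.

```go
package main

import "testing"

type item struct{ next *item }

// sink keeps returned items reachable so they cannot stay on the stack.
var sink [2]*item

// plainIter starts with an empty pool, so early newItem calls allocate.
type plainIter struct{ waste *item }

func (it *plainIter) newItem() *item {
	if i := it.waste; i != nil {
		it.waste = i.next
		i.next = nil
		return i
	}
	return &item{}
}

// seededIter embeds two slots and starts with both of them in the pool.
type seededIter struct {
	waste     *item
	prefilled [2]item
}

func newSeededIter() *seededIter {
	it := &seededIter{}
	for i := range it.prefilled {
		it.prefilled[i].next = it.waste
		it.waste = &it.prefilled[i]
	}
	return it
}

func (it *seededIter) newItem() *item {
	if i := it.waste; i != nil {
		it.waste = i.next
		i.next = nil
		return i
	}
	return &item{}
}

// measure returns the average allocations for creating a fresh iterator
// and taking two items from it, for each variant.
func measure() (plain, seeded float64) {
	plain = testing.AllocsPerRun(1000, func() {
		it := &plainIter{}
		sink[0] = it.newItem()
		sink[1] = it.newItem()
	})
	seeded = testing.AllocsPerRun(1000, func() {
		it := newSeededIter()
		sink[0] = it.newItem()
		sink[1] = it.newItem()
	})
	return plain, seeded
}
```

Calling `measure()` and printing both values should show the seeded variant allocating less per iterator than the plain one.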

Test plan

  • go test ./... passes
  • Existing iterator tests cover the prefetch and Item recycling paths

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@shaunpatterson shaunpatterson requested a review from a team as a code owner May 26, 2026 00:44